skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Search for: All records

Creators/Authors contains: "Yu, Dongxiao"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. The emerging Ray-tracing cores on GPUs have been repurposed for non-ray-tracing tasks by researchers recently. In this paper, we explore the benefits and effectiveness of executing graph algorithms on RT cores. We re-design breadth-first search and triangle counting on the new hardware as graph algorithm representatives. Our implementations focus on how to convert the graph operations to bounding volume hierarchy construction and ray generation, which are computational paradigms specific to ray tracing. We evaluate our RT-based methods on a wide range of real-world datasets. The results do not show the advantage of the RT-based methods over CUDA-based methods. We extend the experiments to the set intersection workload on synthesized datasets, and the RT-based method shows superior performance when the skew ratio is high. By carefully comparing the RT-based and CUDA-based binary search, we discover that RT cores are more efficient at searching for elements, but this comes with a constant and non-trivial overhead of the execution pipeline. Furthermore, the overhead of BVH construction is substantially higher than sorting on CUDA cores for large datasets. Our case studies unveil several rules of adapting graph algorithms to ray-tracing cores that might benefit future evolution of the emerging hardware towards general-computing tasks. 
    more » « less
    Free, publicly-accessible full text available May 27, 2026
  2. Fixed-point decimal operations in databases with arbitrary-precision arithmetic refer to the ability to store and operate decimal fraction numbers with an arbitrary length of digits. This type of operation has become a requirement for many applications, including scientific databases, financial data processing, geometric data processing, and cryptography. However, the state-of-the-art fixed-point decimal technology either provides high performance for low-precision operations or supports arbitrary-precision arithmetic operations at low performance. In this paper, we present a design and implementation of a framework called UltraPrecise which supports arbitraryprecision arithmetic for databases on GPU, aiming to gain high performance for arbitrary-precision arithmetic operations. We build our framework based on the just-in-time compilation technique and optimize its performance via data representation design, PTX acceleration, and expression scheduling. UltraPrecise achieves comparable performance to other high-performance databases for low-precision arithmetic operations. For highprecision, we show that UltraPrecise consistently outperforms existing databases by two orders of magnitude, including workloads of RSA encryption and trigonometric function approximation. 
    more » « less